CMP Zip Code Analysis
Or: Where Do Our Guests Come From?
The Report
Summary
Where The Data Comes From
With everybody’s help we got access to three sources of zip code data:
CMP membership as of 2023-04-18
~1,500 data points
A snapshot as of 2023-04-18 of every membership with a registered zip code
Exit Survey responses (2021-11-23 to 2023-03-28)
~750 data points
Sales for an entire group who responded to an exit survey
- Also covers a bevy of additional information that can be utilized alongside this
Sales Data (2021-06-01 to 2023-04-30)
~80,000 data points
Aggregated total admissions from each zip code in the time frame
What We Have Done
We’ve cleaned and aggregated all the zip code data sources into a single data frame and summarized it with some basic metrics. Additionally, we gathered zip code data from the US Census and combined that into the same data frame. The metrics we used were:
Count (“How many in each zip code?”)
Percent of Sales (“What proportion of our sales does this zip code account for?”)
Percent of Population Served (“How much of the population of that zip code have we sold to?”)
Density (“How many people per square mile are in that zip code?”)
You can see these in the data table and the plots provided. Using these calculated metrics and some descriptive ones we’ve also narrowed down some probable locations where marketing might be most effective.
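As a rough illustration of how the four metrics relate to each other, here is a minimal Python sketch. It is not the code behind this report: the function name `summarize`, the field names, and the sample numbers are all made up, and it assumes one list entry per admission plus Census-style population and land-area lookups.

```python
from collections import Counter


def summarize(admission_zips, population, area_sq_mi):
    """Return one summary row per zip code seen in the admissions list.

    admission_zips: list of zip code strings, one entry per admission
    population:     dict of zip code -> resident population (assumed Census-style)
    area_sq_mi:     dict of zip code -> land area in square miles (assumed)
    """
    counts = Counter(admission_zips)
    total = sum(counts.values())
    rows = {}
    for zip_code, count in counts.items():
        pop = population.get(zip_code)
        area = area_sq_mi.get(zip_code)
        rows[zip_code] = {
            # Count: how many admissions came from this zip code
            "count": count,
            # Percent of Sales: share of all admissions
            "pct_of_sales": 100 * count / total,
            # Percent of Population Served: admissions relative to residents
            "pct_pop_served": 100 * count / pop if pop else None,
            # Density: residents per square mile
            "density": pop / area if pop and area else None,
        }
    return rows


# Made-up zip codes and numbers, for illustration only
admissions = ["zip_a"] * 30 + ["zip_b"] * 10
population = {"zip_a": 8000, "zip_b": 13000}
area = {"zip_a": 8.0, "zip_b": 10.0}
summary = summarize(admissions, population, area)
```

Zip codes with no Census match get `None` for the population-based metrics instead of a misleading zero.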
Some of the sales data is inflated because ‘15212’ is used as ‘unknown’ in that data set. We tried to reduce it to what we think is accurate (based on the percentage of responses in the exit survey) but we cannot be sure. By adding in the exit survey and membership data we are adding a slight weight to those we know are accurate.
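One way such a correction could work is sketched below in Python. This is an assumption about the general approach, not the exact adjustment used for this report: it takes the share of exit survey respondents from 15212 as the "true" share and solves for the number of 15212 admissions that would produce that share in the sales data. The function name and the example numbers are hypothetical.

```python
def estimate_true_count(placeholder_count, other_count, survey_share):
    """Estimate real admissions for a zip code that doubles as 'unknown'.

    placeholder_count: admissions recorded under the zip code (inflated)
    other_count:       admissions recorded under every other zip code
    survey_share:      fraction of exit survey respondents from the zip code

    Solves x / (other_count + x) = survey_share for x, and never returns
    more than what was actually recorded.
    """
    if not 0 <= survey_share < 1:
        raise ValueError("survey_share must be in [0, 1)")
    estimate = survey_share * other_count / (1 - survey_share)
    return min(placeholder_count, round(estimate))


# Made-up numbers: 5,000 admissions recorded under the placeholder zip code,
# 8,000 under all others, and 20% of survey respondents from that zip code.
adjusted = estimate_true_count(5000, 8000, 0.20)
```

The result is only as good as the exit survey is representative, which is why the estimate cannot be treated as certain.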
The Results
From Far to Near
We have guests come in from 43 states (Alaska and Hawaii not shown) and the District of Columbia! Of course the vast majority came from Pennsylvania or our neighboring states (West Virginia, Virginia, Ohio, and New York).
In Pennsylvania we hit 48 out of 67 counties! PA guests represent 96% of our total guest pool. Those within a one-hour drive of CMP represent 98% of our guests.
Recommended Zip Codes To Target
Below is a table with the top 20 zip codes I would recommend looking into for marketing. These were selected because they meet all of the following criteria:
Is within Pennsylvania
Within ~1 hour drive of CMP
The area has less than 5% of the population served already
Has a population density > 1000 people per square mile
Has a population > 5000
This picks out regions whose residents can get to CMP easily and that have a large number of people (who haven’t all been to CMP already) in a small space, so the advertising gets the biggest bang for the buck!
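The criteria above can be sketched as a simple filter in Python. The field names (`state`, `drive_minutes`, and so on), the 60-minute cutoff for "~1 hour", and the sample rows are assumptions for illustration. Sorting by lowest percent served first is inferred from the order of the table below rather than stated anywhere.

```python
def recommend(rows, limit=20):
    """Return up to `limit` rows that meet every marketing criterion."""
    eligible = [
        r for r in rows
        if r["state"] == "PA"            # is within Pennsylvania
        and r["drive_minutes"] <= 60     # within ~1 hour drive (assumed cutoff)
        and r["pct_pop_served"] < 5      # less than 5% of population served
        and r["density"] > 1000          # people per square mile
        and r["population"] > 5000
    ]
    # Least-served areas first (assumed ordering)
    return sorted(eligible, key=lambda r: r["pct_pop_served"])[:limit]


# Made-up rows, for illustration only
rows = [
    {"zip": "A", "state": "PA", "drive_minutes": 25, "pct_pop_served": 2.0,
     "density": 1500.0, "population": 12000},
    {"zip": "B", "state": "PA", "drive_minutes": 40, "pct_pop_served": 0.5,
     "density": 1200.0, "population": 9000},
    {"zip": "C", "state": "OH", "drive_minutes": 55, "pct_pop_served": 0.2,
     "density": 2000.0, "population": 15000},   # out of state
    {"zip": "D", "state": "PA", "drive_minutes": 20, "pct_pop_served": 7.5,
     "density": 3000.0, "population": 10000},   # already well served
    {"zip": "E", "state": "PA", "drive_minutes": 30, "pct_pop_served": 1.0,
     "density": 400.0, "population": 20000},    # too spread out
]
picks = [r["zip"] for r in recommend(rows)]
```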
Recommended Zip Codes For Marketing
Based on Location, Population Density, and Percent of Population Served

| Zip Code | Percent of Population Served | City | County | Population Size | Population Density |
|---|---|---|---|---|---|
| 16146 | 0.11 | Sharon | Mercer | 13222 | 1281.0 |
| 15104 | 0.37 | Braddock | Allegheny | 8098 | 1267.3 |
| 15132 | 0.62 | McKeesport | Allegheny | 19591 | 1343.1 |
| 15227 | 1.68 | Pittsburgh | Allegheny | 29631 | 1871.3 |
| 15210 | 1.99 | Pittsburgh | Allegheny | 26964 | 2212.4 |
| 15236 | 2.01 | Pittsburgh | Allegheny | 30938 | 1115.3 |
| 15120 | 2.17 | Homestead | Allegheny | 18332 | 1509.8 |
| 15204 | 2.24 | Pittsburgh | Allegheny | 8124 | 1668.2 |
| 15102 | 2.52 | Bethel Park | Allegheny | 30765 | 1118.1 |
| 15223 | 2.62 | Pittsburgh | Allegheny | 7062 | 1463.7 |
| 15220 | 3.24 | Pittsburgh | Allegheny | 18313 | 1495.3 |
| 15219 | 3.31 | Pittsburgh | Allegheny | 10453 | 1834.0 |
| 15211 | 3.59 | Pittsburgh | Allegheny | 10512 | 2562.2 |
| 15221 | 3.71 | Pittsburgh | Allegheny | 29264 | 1819.3 |
| 15145 | 3.84 | Turtle Creek | Allegheny | 6764 | 1362.5 |
| 15232 | 3.85 | Pittsburgh | Allegheny | 11905 | 5618.3 |
| 15224 | 4.11 | Pittsburgh | Allegheny | 10916 | 2761.8 |
| 15213 | 4.30 | Pittsburgh | Allegheny | 28172 | 4457.8 |
| 15203 | 4.42 | Pittsburgh | Allegheny | 10207 | 2656.1 |
| 15234 | 4.55 | Pittsburgh | Allegheny | 14006 | 1737.1 |
Looking Forward
What We Need To Do In The Future
In the end, the most important thing we need to do is change how we enter zip code information at the front desk for guests who don’t provide one. Currently the go-to is to enter “15212” (CMP’s zip code). That presents a problem because we no longer have an accurate representation of who is coming to the Museum from the North Side, and that is the region where an accurate count matters most. If we are under-serving the people where this Museum resides we need to remedy that.
In talking to some of the floor people for visitor services and looking over the sales data sent to me it seems like having “00000” be the “no zip code provided” code is the best option for the following reasons:
It is easy to remember and enter for staff.
It corresponds to no geographic area and is easy to parse out when needed.
We already know the reporting system can handle it in a preferable way because I have seen it in the data I received.
So few were entered into the system that entering it by accident is unlikely (only 2 admissions attributed to it out of 171,621 admissions total in the uncleaned sales data).
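To show how easy “00000” would be to parse out, here is a minimal Python sketch. The names `UNKNOWN_ZIP`, `normalize_zip`, and `split_unknown` are hypothetical, and the zero-padding step assumes that some exports may store zip codes as numbers and drop leading zeros.

```python
UNKNOWN_ZIP = "00000"  # proposed "no zip code provided" code


def normalize_zip(value):
    """Turn a zip code stored as text or a number into a 5-character string."""
    return str(value).strip().zfill(5)


def split_unknown(admission_zips):
    """Separate real zip codes from 'no zip code provided' entries."""
    normalized = [normalize_zip(z) for z in admission_zips]
    known = [z for z in normalized if z != UNKNOWN_ZIP]
    unknown_count = len(normalized) - len(known)
    return known, unknown_count


# Made-up entries: a mix of text and numeric values
known, unknown_count = split_unknown(["15212", 15104, "00000", 0, "8540"])
```

With this in place, every “15212” left in the data would be a real North Side guest rather than a stand-in for “unknown”.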
We need the sales data to be the ground truth here, so we need it to be the most accurate representation we can get. If we could trust the sales data 100%, we could see if and how membership and exit survey respondents are skewed from who is coming in.
What We Can Do In The Future
Coordinate with Marketing to see how geographically-bound ads affected guest turnout from the region (and other metrics)
See how representative we are to the specific areas we serve and how our exit survey & membership data are skewed (requires accurate reporting of all zip codes)
Incorporate estimated demographics (race/ethnicity and household)
With regularly updated information we can map how our guest turnout changes over time
Additionally, I can set up a pipeline so that each team can easily update the data and see the analysis without redoing most of this work every time, and without necessarily needing outside help (i.e. me) every time.
Explorable Data
In the interest of transparency I’ve also included explorable data sets for both the plot and the cleaned/summarized table! These are only viewable if you download the file and open it in your browser instead of looking at it through Sharepoint, Teams, or however else it has been shared with you.
Below you’ll find an explorable map with all the data we have for the state of Pennsylvania. You’ll be able to zoom in and out of the map as well as move across the state. Clicking any of the highlighted zip code regions will bring up some of the information we have on it. The shading of the region denotes the percentage of the population served. Regions with red circles in them denote one of the recommended areas.
And now below we have the full summarized & cleaned data set. You’ll be able to sort, filter, and paginate through it to your heart’s content. You’ll also be able to download this as a raw csv or excel worksheet and do your own playing with the data if you wish!
Any additional comments, questions, concerns, or follow-ups you wish to see can be sent to Nour al-Zaghloul at L&R (nal-zaghloul@pittsburghkids.org) who will be happy to talk to you about any of the above!